Parzen Estimator
Tree-Structured Parzen Estimator Can Solve Black-Box Combinatorial Optimization More Efficiently
Abe, Kenshin, Wang, Yunzhuo, Watanabe, Shuhei
Tree-structured Parzen estimator (TPE) is a versatile hyperparameter optimization (HPO) method supported by popular HPO tools. Since these HPO tools have been developed in line with the trend of deep learning (DL), problem setups common in the DL domain, such as multi-objective optimization and multi-fidelity optimization, have been discussed for TPE. However, the practical applications of HPO are not limited to DL, and black-box combinatorial optimization is actively used in some domains, e.g., chemistry and biology. As combinatorial optimization has been an untouched, yet very important, topic for TPE, we propose an efficient combinatorial optimization algorithm for it. In this paper, we first generalize the categorical kernel with the numerical kernel in TPE, enabling us to introduce a distance structure into the categorical kernel. We then discuss modifications to the newly developed kernel to handle large combinatorial search spaces. These modifications reduce the time complexity of the kernel calculation with respect to the size of the combinatorial search space. In experiments on synthetic problems, we verify that our proposed method identifies better solutions with fewer evaluations than the original TPE. Our algorithm is available in Optuna, an open-source framework for HPO.
- Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)
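The kernel generalization described in the abstract above can be illustrated in a few lines. Below is a minimal sketch, not the paper's actual implementation: the function names, the exponential decay, and the bandwidth are assumptions, shown next to a simple categorical kernel in the spirit of TPE's usual one for contrast.

```python
import numpy as np

def plain_categorical_kernel(idx, center, n_choices, top=0.9):
    """Distance-free categorical kernel: the observed category gets
    mass `top`; every other category shares the rest uniformly."""
    return top if idx == center else (1.0 - top) / (n_choices - 1)

def distance_aware_kernel(idx, center, choices, dist, bandwidth=1.0):
    """Sketch of a distance-aware generalization: mass decays with a
    user-defined distance from the observed category and is
    re-normalized so the kernel remains a distribution over choices."""
    weights = np.exp(
        -np.array([dist(choices[center], c) for c in choices]) / bandwidth
    )
    return weights[idx] / weights.sum()

# Toy combinatorial space: length-3 bit strings under Hamming distance.
choices = [(a, b, c) for a in (0, 1) for b in (0, 1) for c in (0, 1)]
hamming = lambda x, y: sum(xi != yi for xi, yi in zip(x, y))
print(distance_aware_kernel(idx=1, center=0, choices=choices, dist=hamming))
```

Note that the normalization costs O(|choices|) per evaluation, which blows up on combinatorial spaces; reducing that dependence is what the "modifications" in the abstract target.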
Derivation of Output Correlation Inferences for Multi-Output (aka Multi-Task) Gaussian Process
The Gaussian process (GP) is arguably one of the most widely used machine learning algorithms in practice. One of its prominent applications is Bayesian optimization (BO). Although the vanilla GP itself is already a powerful tool for BO, it is often beneficial to be able to consider the dependencies of multiple outputs. To this end, the multi-task GP (MTGP) was formulated, but it is not trivial to fully understand the derivations of its formulations and their gradients from the previous literature. This paper provides friendly derivations of the MTGP formulations and their gradients.
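For readers skimming the listing, the most common MTGP construction, the intrinsic coregionalization model, conveys the gist; whether the note derives exactly this parameterization is an assumption.

```latex
% Intrinsic coregionalization model (ICM) sketch.
% k_x is the base kernel on inputs; B is the task (output) covariance,
% kept positive semi-definite via the free-form parameterization below.
\[
  \operatorname{cov}\bigl(f_t(\mathbf{x}),\, f_{t'}(\mathbf{x}')\bigr)
    = B_{t t'}\, k_x(\mathbf{x}, \mathbf{x}'),
  \qquad
  B = L L^{\top} + \operatorname{diag}(\mathbf{v}).
\]
% The full covariance over all tasks and inputs is the Kronecker product
% K = B (x) K_x, and the output correlation between tasks t and t' is
\[
  \operatorname{corr}(f_t, f_{t'}) = \frac{B_{t t'}}{\sqrt{B_{t t}\, B_{t' t'}}}.
\]
```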
Building a Tree-Structured Parzen Estimator from Scratch (Kind Of)
The way a machine learning model fits itself to data is governed by a set of initial conditions called hyperparameters. Hyperparameters help to restrict the learning behavior of a model so that it will (hopefully) be able to fit the data well and within a reasonable amount of time. Finding the best set of hyperparameters (often called "tuning") is one of the most important and time-consuming parts of the modeling task. Historical approaches to hyperparameter tuning involve either a brute-force or a random search over a grid of hyperparameter combinations, known as Grid Search and Random Search, respectively. Although popular, Grid and Random Search lack any mechanism for converging toward a good set of hyperparameters; that is, they are purely trial and error.
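To see why a model-based tuner can converge where random search cannot, here is a single-variable sketch of the TPE step the post builds up to. The fixed quantile gamma = 0.25, the candidate count, and the use of scipy's gaussian_kde in place of TPE's weighted Parzen windows are all simplifying assumptions.

```python
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(0)
f = lambda x: (x - 2.0) ** 2          # toy objective to minimize

# A few initial random trials (this is where Random Search would stop).
xs = rng.uniform(-5, 5, size=20)
ys = f(xs)
gamma = 0.25                          # fraction of trials deemed "good"

for _ in range(30):
    # Split past trials into "good" and "bad" at the gamma-quantile.
    cut = np.quantile(ys, gamma)
    good, bad = xs[ys <= cut], xs[ys > cut]
    l, g = gaussian_kde(good), gaussian_kde(bad)   # l(x): good, g(x): bad
    # Sample candidates from l and keep the one maximizing l(x)/g(x);
    # in TPE this ratio is the acquisition function being maximized.
    cand = l.resample(64, seed=rng).ravel()
    x_new = cand[np.argmax(l(cand) / g(cand))]
    xs, ys = np.append(xs, x_new), np.append(ys, f(x_new))

print("best x:", xs[np.argmin(ys)])   # should approach 2.0
```

Each iteration reuses all past evaluations to propose the next point, which is exactly the converging behavior Grid and Random Search lack.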
Auto-Surprise: An Automated Recommender-System (AutoRecSys) Library with Tree of Parzens Estimator (TPE) Optimization
We introduce Auto-Surprise, an automated recommender system library. Auto-Surprise is an extension of the Surprise recommender system library and eases the algorithm selection and configuration process. Compared to the out-of-the-box Surprise library, Auto-Surprise performs better when evaluated on the MovieLens, Book-Crossing, and Jester datasets. It may also result in the selection of an algorithm with significantly lower runtime. Compared to Surprise's grid search, Auto-Surprise performs equally well or slightly better in terms of RMSE, and is notably faster at finding the optimal hyperparameters.
- Europe > Ireland > Leinster > County Dublin > Dublin (0.14)
- South America > Brazil (0.04)
- North America > United States > New York > New York County > New York City (0.04)
A Risk Minimization Principle for a Class of Parzen Estimators
Pelckmans, Kristiaan, Suykens, Johan, De Moor, Bart
This paper explores the use of a Maximal Average Margin (MAM) optimality principle for the design of learning algorithms. It is shown that applying this risk minimization principle results in a class of (computationally) simple learning machines similar to the classical Parzen window classifier. A direct relation with Rademacher complexities is established, facilitating analysis and providing a notion of certainty of prediction. The analysis is related to Support Vector Machines by means of a margin transformation. The power of the MAM principle is further illustrated by application to ordinal regression tasks, resulting in an $O(n)$ algorithm able to process large datasets in reasonable time.
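The link between the MAM principle and Parzen windows admits a one-line derivation; this is the standard kernel-mean argument, and the paper's exact notation may differ.

```latex
% Maximize the empirical average margin over unit-norm w in feature space:
\[
  \max_{\lVert w \rVert \le 1}\; \frac{1}{n} \sum_{i=1}^{n}
    y_i \, \langle w, \varphi(x_i) \rangle
  \quad\Longrightarrow\quad
  w^{\star} \propto \sum_{i=1}^{n} y_i \, \varphi(x_i).
\]
% The resulting classifier is therefore a kernel (Parzen-window) rule,
\[
  f(x) = \operatorname{sign}\Bigl( \sum_{i=1}^{n} y_i \, K(x_i, x) \Bigr),
\]
% obtained without solving a quadratic program, which is where the
% O(n) cost mentioned in the abstract comes from.
```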